Classifying Documents with Poisson Mixtures

Similar resources

Classifying Documents Without Labels

Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform classification of text rely on the existence of a sample of documents whose class labels are known. However, in many situations, obtaining this sample may not be an easy (or even possible) task. Consider for instance, a set o...

Learning with Taxonomies: Classifying Documents and Words

Automatically extracting semantic information about word meaning and document topic from text typically involves an extensive number of classes. Such classes may represent predefined word senses, topics or document categories and are often organized in a taxonomy. The latter encodes important information, which should be exploited in learning classifiers from labeled training data. To that exte...

Classifying with Gaussian Mixtures and Clusters

In this paper, we derive classifiers which are winner-take-all (WTA) approximations to a Bayes classifier with Gaussian mixtures for class conditional densities. The derived classifiers include clustering based algorithms like LVQ and k-Means. We propose a constrained rank Gaussian mixtures model and derive a WTA algorithm for it. Our experiments with two speech classification tasks indicate th...

On Poisson–Tweedie mixtures

Poisson-Tweedie mixtures are the Poisson mixtures for which the mixing measure is generated by those members of the family of Tweedie distributions whose support is non-negative. This class of non-negative integer-valued ...

Poisson Mixtures

Shannon (1948) showed that a wide range of practical problems can be reduced to the problem of estimating probability distributions of words and n-grams in text. It has become standard practice in text compression, speech recognition, information retrieval and many other applications of Shannon's theory to introduce a "bag-of-words" assumption. But obviously, word rates vary from genre to genr...

Journal

Journal title: Transactions on Machine Learning and Artificial Intelligence

Year: 2014

ISSN: 2054-7390

DOI: 10.14738/tmlai.24.388